A Scalable System for Embedded Large Vocabulary Continuous Speech Recognition

نویسندگان

  • G. Linarès
  • D. Massonié
  • P. Nocéra
  • C. Lévy
چکیده

This paper presents a system for large vocabulary continuous speech recognition in condition of constrained hardware resources. We investigate efficient pruning and caching strategy aiming to handle extensive acoustic and linguistic modeling. Software components are analyzed in terms of resource consuming. Then, we evaluate the system performance in extreme configuration where acoustic and linguistic models are dramatically pruned. Results show that the system design we proposed allows to use large HMM-based acoustic models and trigram language models while performing very fast decoding, under 0.6 real-time on a standard desktop computer while remaining the transcript relevance.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB) as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIBchr('39')s archive is one of the richest archives in Iran containing a huge amount of multimedia data. Monitoring this massive volume of data, and brows and retrieval of this archive is one of the key issues for this broadcasting...

متن کامل

Parallelized viterbi processor for 5, 000-word large-vocabulary real-time continuous speech recognition FPGA system

We propose a novel Viterbi processor for the large vocabulary real-time continuous speech recognition. This processor is built with multi Viterbi cores. Since each core can independently compute, these cores reduce the cycle times very efficiently. To verify the effect of utilizing multi cores, we implement a dual-core Viterbi processor in an FPGA and achieve 49% cycle-time reduction, compared ...

متن کامل

Profiling large-vocabulary continuous speech recognition on embedded devices: a hardware resource sensitivity analysis

When deployed in embedded systems, speech recognizers are necessarily reduced from large-vocabulary continuous speech recognizers (LVCSR) found on desktops or servers to fit the limited hardware. However, embedded hardware continues to evolve in capability; today’s smartphones are vastly more powerful than their recent ancestors. This begets a new question: which hardware features not currently...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007